Mamba MOE Quant Configs + Fix Export Bug #882
Conversation
📝 Walkthrough: This PR adds Mamba MOE-specific quantization configuration variants for the FP8 and NVFP4 quantizers, and fixes export module exclusion pattern handling by stripping trailing dots from prefix names.
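The trailing-dot fix described above can be sketched as follows. The function name and call shape are illustrative assumptions, not the actual ModelOpt export code; the sketch only demonstrates the normalization the PR describes:

```python
def normalize_exclude_patterns(exclude_modules: list[str]) -> list[str]:
    """Strip trailing dots from module-prefix exclude patterns.

    Hypothetical sketch: a prefix assembled as name + "." (e.g.
    "decoder.layers.0.") would never match the exported module name
    "decoder.layers.0", so the module would not be excluded as intended.
    """
    return [pattern.rstrip(".") for pattern in exclude_modules]
```

Exact-name patterns without a trailing dot pass through unchanged.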
Codecov Report: ✅ All modified and coverable lines are covered by tests.

```
@@           Coverage Diff           @@
##             main     #882   +/-   ##
=======================================
  Coverage   73.73%   73.74%
=======================================
  Files         199      199
  Lines       21165    21170    +5
=======================================
+ Hits        15606    15611    +5
  Misses       5559     5559
```

☔ View full report in Codecov by Sentry.
**ChenhanYu** left a comment:

> The formatting test is failing.
Force-pushed from 700c32d to 5b983e3.
Signed-off-by: Jennifer Chen <jennifchen@nvidia.com>
Force-pushed from 801881e to c9fb020.
## What does this PR do?

**Type of change:** Bug fix

**Overview:**
- Fix a bug in MCore export `exclude_modules` where the layer prefixes had an extra period at the end
- Add custom quant configs for Mamba MOEs

## Usage

```python
# Add a code snippet demonstrating how to use this
```

## Testing

## Before your PR is "*Ready for review*"

- **Make sure you read and follow [Contributor guidelines](https://github.com/NVIDIA/Model-Optimizer/blob/main/CONTRIBUTING.md)** and your commits are signed.
- **Is this change backward compatible?**: Yes/No
- **Did you write any new necessary tests?**: Yes/No
- **Did you add or update any necessary documentation?**: Yes/No
- **Did you update [Changelog](https://github.com/NVIDIA/Model-Optimizer/blob/main/CHANGELOG.rst)?**: Yes/No

## Additional Information

## Summary by CodeRabbit

* **New Features**
  * Added four new Mamba MOE quantization configurations: aggressive and conservative variants for both FP8 and NVFP4 quantization schemes.
* **Bug Fixes**
  * Export module exclusion patterns are now normalized by stripping trailing dots.

---

Signed-off-by: Jennifer Chen <jennifchen@nvidia.com>
Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>
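As a rough illustration of what one of the new config variants might look like: everything below (the config name, wildcard patterns, and which modules a "conservative" variant leaves unquantized) is an assumption written in the general dict style of ModelOpt quant configs, not the PR's actual definitions:

```python
# Hypothetical FP8 "conservative" variant for Mamba MOE models.
# All names and patterns here are illustrative assumptions.
MAMBA_MOE_FP8_CONSERVATIVE_CFG = {
    "quant_cfg": {
        # FP8 (E4M3) for linear-layer weights and activations
        "*weight_quantizer": {"num_bits": (4, 3), "axis": None},
        "*input_quantizer": {"num_bits": (4, 3), "axis": None},
        # Conservative: keep Mamba state-space mixers and MOE routers unquantized
        "*mixer*": {"enable": False},
        "*router*": {"enable": False},
        "default": {"enable": False},
    },
    "algorithm": "max",
}
```

An "aggressive" variant would presumably enable quantization on more of these module patterns; the FP8 vs. NVFP4 split would change the quantizer settings rather than the overall structure.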